Practical Entropy-Compressed Rank/Select Dictionary
Authors
Abstract
Rank/Select dictionaries are data structures for an ordered set S ⊆ {0, 1, ..., n − 1} that compute rank(x, S) (the number of elements in S which are no greater than x) and select(i, S) (the i-th smallest element in S). They are fundamental components of succinct data structures for strings, trees, graphs, etc. For those data structures, however, only asymptotic behavior has been considered, and their performance on real data is not satisfactory. In this paper, we propose four novel Rank/Select dictionaries, esp, recrank, vcode and sdarray. Each of them is small if the number of elements in S is small, and in practice its size is close to nH0(S) (where H0(S) ≤ 1 is the zero-th order empirical entropy of S), while its query time is superior to that of previous ones. Experimental results reveal the characteristics of our data structures and also show that they are superior to existing implementations in both size and query time.
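To make the two queries and the nH0(S) bound concrete, here is a minimal Python sketch. It is not one of the paper's four structures (esp, recrank, vcode, sdarray): it stores S as a plain sorted list with no compression. The function names, the 1-based indexing of select, and the example set are assumptions chosen for illustration, and n_h0 assumes S is viewed as a bit vector of length n with m ones.

```python
from bisect import bisect_right
from math import log2


def rank(x, s):
    """Number of elements in the sorted list s that are no greater than x."""
    return bisect_right(s, x)


def select(i, s):
    """The i-th smallest element of s, counting from 1 (assumed convention)."""
    if not 1 <= i <= len(s):
        raise IndexError("i must be in 1..len(s)")
    return s[i - 1]


def n_h0(n, m):
    """n * H0 in bits, for m elements taken from a universe of size n.

    Assumes S is viewed as a bit vector of length n containing m ones.
    """
    if m == 0 or m == n:
        return 0.0
    p = m / n
    return n * (-p * log2(p) - (1 - p) * log2(1 - p))


# Hypothetical example: S is a subset of {0, ..., 15}.
S = [2, 3, 5, 7, 11]
print(rank(6, S))         # counts 2, 3 and 5
print(select(4, S))       # fourth smallest element
print(n_h0(16, len(S)))   # entropy bound in bits, below n = 16
```

With this convention the two queries are inverse in the sense that rank(select(i, S), S) = i for every valid i. The sorted list costs a full machine word per element, which is the kind of overhead the entropy-compressed structures in the abstract aim to avoid.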
Similar resources
A Compressed-Gap Data-Aware Measure for Indexable Dictionaries
We consider the problem of building a compressed fully-indexable dictionary over a set S of n items out of a universe U = {0, ..., u − 1}. We use gap-encoding combined with entropy compression in order to reduce the space of our structures. Let H0 be the zero-order empirical entropy of the gap stream. We observe that nH0 ∈ o(gap) if the gaps are highly compressible, and prove that nH0 ≤ n lo...
High-Order Entropy Compressed Bit Vectors with Rank/Select
We design practical implementations of data structures for compressing bit-vectors to support efficient rank-queries (counting the number of ones up to a given point). Unlike previous approaches, which either store the bit vectors plainly, or focus on compressing bit-vectors with low densities of ones or zeros, we aim at low entropies of higher order, for example 101010...10. Our implementa...
Practical Rank/Select Queries over Arbitrary Sequences
We present a practical study on the compact representation of sequences supporting rank, select, and access queries. While there are several theoretical solutions to the problem, only a few have been tried out, and there is little idea on how the others would perform, especially in the case of sequences with very large alphabets. We first present a new practical implementation of the compressed...
Rank and select: Another lesson learned
Rank and select queries on bitmaps are essential building blocks of many compressed data structures, including text indexes, membership and range supporting spatial data structures, compressed graphs, and more. Studied theoretically since the 1980s, these primitives have also been a subject of active research concerning their practical incarnations in the last decade. We present a few novel rank...
Alphabet Partitioning for Compressed Rank/Select and Applications
We present a data structure that stores a string s[1..n] over the alphabet [1..σ] in nH0(s) + o(n)(H0(s)+1) bits, where H0(s) is the zero-order entropy of s. This data structure supports the queries access and rank in time O(lg lg σ), and the select query in constant time. This result improves on previously known data structures using nH0(s) + o(n lg σ) bits, where on highly compressible insta...
Journal:
- CoRR
Volume abs/cs/0610001, Issue -
Pages -
Publication date: 2007